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We recover piecewise constant signals from noisy measurements / by the minimization of the 
L'-Potts functional y||Vw||o + \\u - We present an 0(n 2 ) time and 0(n) space algorithm for the 
computation of an exact minimizer of this non-convex optimization problem. We show that our algo- 
rithm recovers mildly blurred piecewise constant signals without knowledge of the blurring operator. 
For stronger blurred signals and known blurring operator, we derive an iterative reconstruction algo- 
rithm. 

1 Introduction 

Piecewise constant signals appear naturally in various fields of applied sciences, for example, cross-hybridization 
of DNA [14, 22] or reconstruction of brain stimuli [29]. The goal is to recover the true piecewise constant signal 
from a set of noisy measurements; for instance, the reconstruction from an incomplete frequency spectrum has 
been investigated in [7, 25]. Here, we deal with the reconstruction of piecewise constant signals from noisy and 
blurred spatial measurements. 

The reconstruction of such signals is an inverse problem which is typically tackled by minimization of an 
energy functional. The energy functional reflects a tradeoff between data fidelity and regularity. While classical 
Tykhonov approaches favor oversmoothed solutions and are not robust to non-Gaussian noise, several functionals 
have been proposed to overcome these limitations, see e.g. [26]. In recent years, L'-TV and related minimization 
problems have gained a lot of interest in the literature and have been applied to the reconstruction of signals with 
jump singularities, see e.g. [9, 8, 18, 31] and the references therein. In its simplest form, the L'-TV problem 
reads 



The difference between the argument u and the observed data / is measured by the L'-norm. The regularization 
term ||V«||i favors solutions of low total variation where the influence of the regularization is controlled by the 
parameter y > 0. Due to the non differentiability of the L'-norm, the TV-L 1 problem is mathematically and 
computationally more complex than the total variation minimization with an L 2 data term. However, it has been 
observed that the L 1 -data fidelity is more robust to non-Gaussian noise such as salt and pepper noise (cf. [9] and 
the references therein). In presence of Gaussian noise, the results of both data terms are comparable. 

Here, we aim at reconstructing piecewise constant signals, thus the regularization penalty shall enforce this a 
priori assumption. The natural penalty for piecewise constant solutions is the jump count of u. This leads to the 
L' -Potts minimization problem 



Note that the slightly abusive but appealing norm notation || ■ ||o has been popularised by its use in the sparsity 
community, see for instance [15]. In contrast to the TV regularity term, the jump penalty is non-convex, which 



y||Vw||, + \\u-f\\i ^min. 



(L'-TV) 




(L'-Potts) 



1 



128 256 




128 256 




128 256 



Figure 1: Top: Piecewise constant signal blurred by moving average of length 7 and corrupted by Laplacian (first column) Gaussian noise 
(second column), both of standard deviation <x = 0.1. The third signal is salt and pepper noise where 25% of the data is destroyed. 
Bottom: For Laplacian noise (left) and salt and pepper noise (right), the signal is almost perfectly recovered by the minimizer of 
(L'-Potts) (y = 1, ground truth as dashed line.) Under Gaussian noise (central column), the plateau heights deviate slightly, but 
jump locations are almost perfectly detected. 



makes standard convex optimization techniques unfeasible. In general, minimizers of (L'-Potts) are not unique. 
The possible reasons for non-uniqueness are discussed in Section 2. 

Replacing the data term \\u - f\\i in (L'-Potts) by \\u - /||?, one gets the classical 1? -Potts functional 

y ■ J(u) + \ Ui - M 2 . (L 2 -Potts) 

The investigation of (L 2 -Potts) can be traced back to R. Potts [27] in 1952 and even to E. Ising [23] in 1925 in 
a physical context. S. Geman and D. Geman [20] were the first to consider such functionals in the context of 
image processing. For a thorough account on the history we refer to [29] and the references therein. More recent 
results can be found in [30, 4]. Higher dimensional Potts functionals are computationally infeasable [2], but can 
be tackled by restricting the set of admissible partitions. One approach in that direction are wedgelets [13]. 

As with TV minimization, we see in this paper that the L'-Potts functional is more robust to non Gaussian 
noise than the L 2 variant, and that it yields comparable results for Gaussian noise. As a tradeoff, the L' data 
term is mathematically and algorithmically much more challenging than the L 2 variant. However, we shall see in 
this work that solutions of the L'-Potts functional can be obtained as fast as those of the L 2 variant, answering a 
question of V. Liebscher [24]. In fact, we present a fast algorithm to exactly solve (L'-Potts) in 0(n 2 ) time and 
0(n) space. The minimization technique for (L'-Potts) is of combinatorial type. 

Besides its denoising property, the L'-Potts has remarkable blind deconvolution capabilities. In fact, we see 
in this paper that the L'-Potts functional is able to recover piecewise constant signals from mildly blurred mea- 
surements without knowledge of the blurring operator. This is in contrast to the L'-TV minimization, which only 
performs well when the convolution kernel is known; see Figure 7. 

For reconstruction of severely blurred signals, we consider the related deconvolution problem 

y ■ J(u) + \\K * u - /Hj -> min . (K-L'-Potts) 

Here our fast algorithm to solve (1) is employed as basic building block of a (heuristic) iterative algorithm; see 
Figure 9. 

1.1. Related work and contribution A fast algorithm for (L 2 -Potts) has been introduced by V. Liebscher and 
G. Winkler and can be found in [17]. This algorithm computes an exact minimizer in 0(n 2 ) time and O(n) space, 
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where n denotes the length of the discrete data. In their paper, they also consider the discrete L 1 -Potts model. 
Using a red-black tree approach they derive an 0(n 2 log n) time algorithm to solve (L 1 -Potts) which beats the time 
complexity of a naive implementation by a factor of n. However, this comes with 0(n 2 ) space consumption, thus 
losing a factor of n against the naive implementation. V. Liebscher [24] posed the question whether it is possible 
to minimize the L 1 -Potts functional as fast as the classical L 2 version. In this paper, we give an affirmative answer. 

Theorem A. The proposed L 1 -Potts algorithm computes an exact minimizer of the discrete (unweighted) L 1 - 
Potts functional within 0(n 2 ) time and O(n) space. 

The key to this speed-up is to work with suitable data structures. Besides this asymptotic result, the actual 
runtime of the L 1 algorithm is less than 20% slower than that of the L 2 version, for large data size. Furthermore, 
the algorithm copes with weighted data fidelities, in a comparable runtime. These weights arise naturally in 
applications where data is available only on a non-uniform grid [28]. In addition, they result from discretization 
of the continuous Potts model by non-equidistant sampling. 

The continuous L l -Potts functional P y acting on piecewise constant signals u on the unit interval is given by 

/>» = y7( M ) + ||H~/|| 1 . (1) 

The function / on [0, 1] represents data and is assumed to be integrable. The minimizers of the L 1 -Potts functional 
(1) can be considered as the best piecewise constant approximation with respect to the L'-norm and a minimal 
number of jumps (or equivalently of pieces). Obviously, the above L 1 -Potts functional is associated with /, but 
we omit this dependence in our notation. 

We consider certain discretizations P k y (defined in (7)) of the continuous Potts model (1). These discretizations 
result in discrete weighted L 1 -Potts problems which can be tackled by our algorithm. We obtain the following 
convergence results. 

Theorem B. The functionals P k y T-converge to the continuous L 1 -Potts functional P y as the sampling density 
approaches 0. Furthermore, each accumulation point of a sequence of minimizers of the discrete approximations 
is a minimizer of the continuous model. 

F-convergence and convergence of minimizers in relation with the L 2 -Potts functional are topics of [4], [5]. 
In these papers, T-convergence is shown for a modified L 2 -Potts functional, which is used in turn to obtain a 
convergence statement for the minimizers of the discrete approximations as in Theorem B. This approach relies 
on the Hilbert space structure of L 2 , and therefore does not carry over to the L 1 context. 

We furthermore observe that the L 1 -Potts functional is able to perform blind deconvolution for mildly blurred 
data. In fact, if data / = K * g is given by a blurring of the piecewise constant signal g using a convolution kernel 
K, we can show the following assertion. 

Theorem C. For data of the form / = K * g, there are Potts parameters y and a maximal size of the kernel's 
support such that g is the unique minimizer of the Potts functional P y . If / is sampled sufficiently dense, the 
discrete Potts functional yields a solution which equals g up to shift of jump locations of at most half the sampling 
density. As a consequence, these solutions approach g as the sampling becomes finer. 

The precise statement is formulated as Theorem 1 1 and Theorem 12. Experiments show that the deconvolution 
property persists also in the presence of various types of noise; see Figure 1 . 

If the support of the kernel K involved is too large then the L 1 -Potts functional does not recover intrinsically 
the true signal g from the blurred data / = K*g. In this case, we tackle the deconvolution problem by minimizing 
the functional (K-L 1 -Potts). Here, we have to assume that the kernel K is known. Theoretical properties of the 
continuous l? variant of (K-L 1 -Potts) have been investigated in [3]. However, to the best of our knowledge, no 
algorithm to solve either (K-L 1 -Potts) or the corresponding L 2 -variant has been proposed yet. Here, we propose 
a heuristic to solve (K-L 1 -Potts). To this end, we introduce a new variable v in (K-L 1 -Potts) to obtain the splitting 

yj(u) + n\\u - v|h + \\K * v - /||i -> min . 

If we fix v, then the bivariate functional reduces to the Potts functional with respect to u. For fixed u, we get a 
convex K-L x -L x optimization problem in v, i.e., K-L 1 data term and L 1 regularization penalty. We approach this 
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bivariate functional by solving alternately 

■yJ(u) + fi\\u - v||i — > min and fi\\u — v|| 1 + \\K* v — f\\\ — > min (2) 

by our fast algorithm and by a standard method for the K-l}-L l minimization [1], respectively. Experiments 
confirming the power of this heuristic can be found in Figure 9. 

Let us summarize how to treat deconvolution problems for piecewise constant signals and known kernel K. For 
arbitrary (known) kernels, we can minimize (K-L 1 -Potts) and, for narrowly supported kernels, we can minimize 
(L 1 -Potts). So it appears at first sight that we have proposed two different approaches in the latter case. However, 
a closer look at (2) shows that both approaches are intimately connected. Indeed, initializing v by the data /, we 
observe that the first step of our iterative algorithm minimizes the ordinary L 1 -Potts functional. Thus, if / and 
K are as in Theorem C and if the parameters are chosen appropriately, this first L 1 -Potts step yields a solution u 
which equals g (up to a small shift of jump locations). This solution u is not changed by the K-L l -L l step and 
the further iterations. Hence the iterative algorithm reduces to solving the ordinary Potts problem (L 1 -Potts). So 
whenever the kernel is known, one should use our iterative algorithm (2). 



1.2. Organisation of the article The structure of our article is as follows. In Section 2 we deal with the F 
convergence of the discrete time L 1 -Potts functionals to their continuous time counterparts. We prove Theorem 
A, which is formulated as Theorem 5. In Section 3 we present our fast algorithm to solve the discrete L 1 -Potts 
problem (L 1 -Potts). We prove the complexity statement Theorem A as a special case of Theorem 7 which, more 
generally, treats the case of unequal weights. We close Section 3 with a comparison of the L 1 -Potts algorithm with 
the L 2 -Potts algorithm of [17] and the L'-TV algorithm of [9] under various types of noise. In Section 4 we show 
Theorem C, which is a statement on blind deconvolution of mildly blurred data. Its precise formulation is given 
in Theorem 11 and Theorem 12. We underpin our findings by experiments. In Section 5 we present a heuristic 
algorithm to solve the L 1 -Potts deconvolution K-L 1 -Potts problem and compare it with L'-TV deconvolution. 



2 Continuous Potts model and F-convergence 

In this section, we consider the continuous L 1 -Potts functional defined for u e L 1 [0, 1] by 

/y •/(«) + ||a- /Ik , if«ePC[0,l] 

Py(U) =i (3) 

loo, else. 

We start this section by providing some basics on L 1 -Potts functionals emphasizing the non-uniqueness of min- 
imizers. Then we consider discretizations of the Potts functional. We show their T-convergence with respect to 
the L'-norm towards the continuous Potts functional and obtain corresponding statements for the convergence of 
minimizers. To this end, we introduce in Section 2.2 the sequence of semi-discrete Potts functionals Q k y given by 

n x< s-\y J ^ + II" - /Hi ' if 11 6 PC *[°' ^ ™ 

\ly\ U ) — \ , V*) 

loo, else. 

Here, PC A [0, 1] denotes the space of those piecewise constant functions which only jump in a finite set X k . The 
discretization sets X k c X k+l form a nested sequence such that their union is dense in [0, 1]. As an intermediate 
step we show in Theorem 4 the T-convergence of these semi-discrete functionals towards the continuous Potts 
functional (3). We furthermore obtain that all minimizers of all the semi-discrete functionals are contained in 
one single compact set. 

The auxiliary functionals Q k are still defined by continuous data /. However, in practice, data is given by 
discrete samples 

/ = S k f 

obtained from / on some level k by the sampling operator St- One implementation of St might be the integral 
sampling 

f(j) = S k fiJ) = -f- I fdA (5) 
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where Ij are the intervals given by the discretization sets X^. If f is continuous, one might also consider point 
sampling 



fU) = S k f(j) = f(xj) 



(6) 



where Xj might be taken as the midpoint of the interval Ij. Then, the (fully) discrete Potts functionals read 



In Section 2.3 we use the result on the semi-discrete Potts functional of Section 2.2 to obtain the analogous 
statements for the fully discrete case, which is formulated as Theorem 5. 

2.1. Basics on L 1 -Potts functionals It is well known that a median minimizes the L' error for a constant 
approximation, that is, 



Thus, in order to find a minimizer u of the L'-Potts problem, it is sufficient to find its jump locations x,. Then, a 
corresponding minimizer is given by u — fJ-il(x t ,x M )> where yu,- is a median of the data / on the interval (x,, x,+i ). 
Recall that a median fi of a real-valued function / with respect to some finite measure v on Q is characterized by 
the property 



In our case, we use weighted counting measures in the discrete case and the Lebesgue measure on the interval 
in the continuous case. If / is a continuous function then there is only one unique median, whereas in general, 
there might be a whole interval of medians. 

This non-uniqueness of the median is one source of non-uniqueness of minimizers of the (discrete and contin- 
uous) L'-Potts functional. Besides this, there are two other sources of non-uniqueness as it can be seen by the 
following example. 

Example 1. To see the three sources of non-uniqueness of minimizers of the L'-Potts functional, we consider 
the data / = 1 [0 i-j + lj-i i j on the interval [0, 1]. 

First, for y > 1, the jump penalty is so high that a minimizer of the L'-Potts functional for / is a constant 
function on the interval [0, 1]. But all constant minimizer candidates u a (x) = a, where a e [0, 1], have a Potts 
value of | . So here, the reason for a non-unique minimizer is the non-uniqueness of the median. Second, for 
y = i, we easily check that the three-jump solution / and the one-jump solution 1 [0 i j have the same minimal 

Potts-value. Thus in that case, there is non-uniqueness with respect to the number of jumps. At last, if | < y < \, 
both Mi = 1 [0 ij and 112 = 1[ |] are minimizers of the L'-Potts functional. To see this, we first check that u\ is 
indeed a solution of the L'-Potts problem (1). As uo has one jump its Potts value is given by 



A constant function cannot be a minimizer since its data term would amount to at least I . So a better Potts value 
can only be achieved by introducing extra jumps. However, we observe that introducing only one extra jump does 
not decrease the data term. Two extra jumps do decrease the data term to 0, but then, we have a jump penalty of 
3y > \+y which is worse than the Potts value P y {u\). Hence, u\ is a solution of the L'-Potts problem, and, as 
Py(ui) = P y (u2), so does ii2- So here, the minimizer is not unique with respect to the partitions. 

Note that the first type of non-uniqueness is a special property of the best approximation with respect to the 
L 1 norm and does not occur in case of the L 2 -Potts functional. The other two types of non-uniqueness also occur 
in the L 2 case but have been shown to occur only on a negligible set of data / for the discrete setting [30]. We 
strongly conjecture the analogous statements can be made for the L'-Potts functional. However, a rigorous proof 
is out of the scope of this article. 




(7) 



ll*-A*lli<ll*-;y||i, forallyeR. 



(8) 



v({x : f(x) < A) < iv(Q) and v({x : /(x) > fx}) < ±v(Q). 



(9) 



Py(uy) = yJ(ui) + \\m - /||i = y + \ < \. 
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2.2. T convergence of semi-discrete Potts functionals In the following, we show the F-convergence of the 
semi-discrete Potts functionals (4) and a convergence statement for corresponding minimizers. 

We recall that a sequence F k of functionals on L'[0, 1] taking values in the extended positive real line [0, oo], 
T-converges to a functional if 

(i) for each u e L'[0, 1] and each sequence {i/j^N in L l [Q, 1] with u k — > u as k — > oo holds 

F°°(k) < liminf^oo F k (u k ), (10) 

(ii) and for each u e L l [0, 1] there is a recovery sequence {u k }kes in L 1 [0, 1] with u k — > u as k — > oo such that 

F"( M )>limsup^ MJ FV). (11) 
We start out showing (10) for the semi-discrete Potts functionals Q k y . 

Lemma 2. Let u e L l [0, 1] ant/ Zef fee a sequence in L l [0, 1] s«cn that \\u k — u\\\ — > /or £ — > oo. Then 

P y (u) <limmf k^Q^iu 11 ). (12) 

Proof. We may restrict ourselves to sequences u k in PC[0, 1]. We first consider u e L 1 which is not essentially 
bounded, i.e., u i L°°. We show that, for each sequence of piecewise constant functions u k converging to u, the 
number of jumps tends to infinity, which then yields (12) for essentially unbounded /. To this end, we construct 
a sequence of disjoint intervals /, and a sequence of positive numbers e, for which we show the following. For 
an approximating piecewise constant function?/, \\u- h||i < e, implies that 7/ takes values in /,. 

Since u is essentially unbounded, the Lebesgue measure A({\u\ > n}) is positive for any positive integer n. 
We choose n\ — 1 and let E\ - A(\\u\ > 2n 1 })/2. Using the Markov inequality, we find m\ > n\ such that 
M{\u\ > nil}) < S\. This implies A{{2n\ < \u\ < mi}) > e\. If ~u does not take a value in the interval l\ = [n\,2m\\, 
then we can estimate its approximation error by 

||m - u\\ x > — K)|{ 2 „,< ie s OTl }|| > min(m u n x ) ■ A({2n x <u< m x }) > s,. 

We choose «2 = 2m; + 1 and let S2 = A({\u\ > 2«2))/2. We proceed likewise to obtain intervals and positive 
numbers e,- with the claimed properties. Then, if u k converges to u, there is some index k t such that \\u l - u\\ < e, 
for all / > kj. Each such piecewise constant function u l must take values in each of the intervals Thus 
the number of jumps of the sequence u k tends to infinity. 

Next, we consider essentially bounded u. We show that, if u k tends to u and the number of jumps of the u k is 
(uniformly) bounded, then u already is piecewise constant (i.e., it has a piecewise constant representative). Thus, 
if // is not piecewise constant, the number of jumps of a sequence u k converging towards u must go to infinity 
which yields (12) for bound u which is not piecewise constant. 

In a first step, we construct a sequence m* which converges to u and which is uniformly bounded (w.r.t. the 
sup-norm) as follows. We define the piecewice constant function m* by u k except for the intervals on which 
\u k \ > 2HMHOO. Here we let = 0. Thenii* is uniformly bounded by 2||m||oo and ||m* - u\\\ < \\u k — u\\\. This entails 
that m* converges to u. 

So we may assume that ur tends to u in L , the piecewise constant functions u k are uniformly bounded (w.r.t. 
the sup-norm) and the number of jumps of the u k is bounded as well. But then the piecewise Constance of u is a 
consequence of the compactness of the set 

Mcj = [g = a < Via) ■0 = Xo<...<Xj=l, \a t \ < c} . (13) 

of those piecewise constant functions having at most j jumps and which are bounded by the constant C (w.r.t. 
the sup-norm). In fact, Mcj is the image of the compact set 

A = [(x, a) e R^ 2 xR j : = x < . . . < Xj = I, \a t \ < c) . 
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under the continuous mapping e(x, a) = £/ =1 a/ l( xt _ uXt )- The mapping e is continuous since \\e(x + E\,a + £2) - 
e(x,a)||i <2C||ei||i +||fia|U, for small enough s\, £2. 

So far, we have shown (12) for non piecewise constant u. Now we consider a piecewise constant function u 
and show (12) in that case. We start by showing the following implication for any piecewise constant function w 

J(w)<J(u) => \\w-u\\ y >\h-l. (14) 

Here h is the minimum jump height and I the minimum interval width of the piecewise constant function u = 

It is sufficient to prove (14) for functions w whose jump set is included in the jump set of u. To see this, we 
consider a general w with less jumps than u and shift its jump locations to those of u such that the resulting 
w' fulfills J(w) > J(w') and ||w - u\\\ < \\w' - u\\\. To see this, let X = {x\, ...,x„} be the jump set of u and 
let T = [t\, t r ) be the jump set of w. Denote S — T \ X the jump locations of w which are not in the jump 
locations of u. We reduce S to by the following recursive procedure. If T c X we are done. Otherwise we pick 
a jump point f, which is not contained in X. Then we consider the difference between u and w to the left of , i.e., 
\u(tt) - w(tr)\ and the difference to the right \u(t t ) - w(tf)\. If \u(t{) - w(tr)\ > \u(tj) - w(tt)\ then we shift f; to the 
left until we meet the next jump point p of u or w. If this point is in X \ T, say p equals xi, then w' reads as 

We have J(w') - J(w), ||w - u\\\ > \\w' - u\\\, and w' has one more jump in the jump set of u than w. If p is in T, 
the procedure reduces the number of jumps of w by 1 and still \\w - u\\\ > \\w' — u\\\. In both cases, the number of 
jumps S which are not in X is reduced by one. In the opposite case, |w(?,) - w(t~)\ < \u(f{) - w(tf)\, we shift to the 
right and proceed analogously. Iterating this algorithm, we obtain a w' whose jumps set is contained in that of u. 
By construction, this W still has less jumps than u, i.e., J(w') < J(u), and ||w - u\\\ > \\w' — u\\\. 

In consequence, we find an interval (?, , f !+ i) where w' does not jump but u does so at f, < p < f, + i. Then 

\\w' - m||i > \u(p + ) - u(p~)\ ■ min(p - t h t M - p) > \hl, (15) 

which shows (14). As a direct consequence of (14) we find for each sequence of piecewise constant functions u k 
converging to u in L l an index ko such that for all k > ko 

J(u k ) > J(u). (16) 

Equipped with this estimate we show (12) for piecewise constant functions u. Since Q*( V ) ^ooforv^PC^fO,!] 
we may assume without loss of generality that is in PC*[0, 1]. Then P y (ut) = Qy(ut) and thus, using (16), 

Py(u) = yJ(u) + \\u- f\\\ < liminf yJ(iik) + \\m\\uk - f\\\ = liminfP y («i) = liminf Q^(uk), (17) 

k k k k 

which completes the proof. □ 

Next we show the second condition for T-convergence (11). 

Lemma 3. For each u in L l [0, 1] there is a recovery sequence u k which converges to u in L l and which fulfills 

P y (u)>\imsup k ^Q 7 (u k ). (18) 

Proof. We may restrict u to PC[0, 1], since otherwise P y (u) = 00. For k e N we define as the largest gap in X k , 
i.e., = sup ; |jc; - Xi+\\. Since the union of discretization sets X k is dense in [0, 1], the sequence q converges to 
as k — > 00. 

Let us fix k for the moment. For every jump location f, of u we find a closest discretization point Xj such that 
\xj - tj\ < ek- Then, for sufficiently large k, replacing each interval (f,, f,+i) in u — Yii a i^(tt,t M ) tne interval 
(x,-, we obtain a piecewise constant function u k such that 

\\u - m*||i <e k - H ■ J(u). 
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Here, H is the maximal jump height of u. Obviously, u and u have the same number of jumps and, therefore, 

Pyiu) = yJ{u) + \\u - f\\ x > yj(u k ) + ||/ - - \\u k - m||! > P y (u k ) -e k H- /(h). 

Rewriting this as P y {u) + e k ■ H • J(u) > P y (u k ) = Q k (u k ) and passing to the limit k — > oo yields the assertion. □ 

Equipped with these preparations, we formulate our main result on the semidiscrete Potts functionals Qz. 

Theorem 4. Let f be an integrable function on [0, 1] and let Py be the corresponding continuous L 1 -Potts 
functional. Then the semi-discrete Potts functionals Q k y with respect to a nested sequence of discretization sets X k 
F-converge to P y . Furthermore, each functional Q k has a minimizer. Each sequence u k , where u k is a minimizer 
of Q k , has at least one accumulation point. Each such accumulation point u is a minimizer of P y , i.e., 

Py(u) = inf veil P 7 (v). (19) 

Proof. Our first aim is to show that all the minimizers of the functionals Q y are contained in a compact subset M 
of L l [0, 1] which is independent of k. We let M = Mcj be the compact set from Equation 13, where we define the 
maximal number of jumps j by the smallest integer larger than ||/||i/y. The constant C is defined in the course 
of the proof. To show that all the minimizers are contained in M, we first prove that 

MQ k Ju)= inf Qy(u) < Q y (v) for all v i M. (20) 

ueM 7 «ePC[0,l] 7 7 

To this end, we consider an arbitrary piecewise constant function u i M and show that there is a piecewise 
constant function ~u contained in M with lower Potts value. Because of the minimizing property (8) of the 
median, we may assume that u = £j a,l( A . f , AV+1 ) is given by the median fij on each interval / = (x,, Xt+i). We have 
the following connection between the absolute value of the median p.j on some interval / and the length of the 
interval |/| 

\fi[\>c => |/| < ^ for any c> 0. (21) 

Indeed, > c implies that A({\f\ > c}) > \I\/2, and the Markov inequality yields A({\f\ > c}) < ||/||i/c. 
Furthermore, if the interval length |/| is small, the integral of |/| on / is small, i.e., for s > there is 6 > such 
that 

\I\<6 => Jl/I^ e ' (2 V 

This implication can be seen by considering the finite measure \f\dA with density |/| with respect to the Lebesgue 
measure A. The corresponding distribution function is absolutely continuous and thus uniformly continuous 
which is precisely the statement of (22). Combining (21) and (22) we find for s > a constant C > such 
that 

IjU^I > C => JV| < e for any interval /. (23) 

Now we define the constant C in M C; as the resulting number C in (23) for the choice s = y/6. If C < 3||/||i we 
enlarge it such that C > 3||/||i. 

We come back to our discussion on the piecewise constant function u which is given by the median pi on each 
interval / = (x,-,x !+ i) and which is not in Mcj. If the number of jumps of u obeys J(u) > j > \\f\\\/y then u = 
has a smaller Potts functional value since 

P k (u) > yj(u) > ||/|h > P 7 (0) > inf VEPC[ o,i] P k y(v). (24) 

Therefore, we may assume that u has at most j jumps. We observe that there is at least one interval / = (x,-, xt+i) 
where \u\ < C. If this was not the case, we would have A({\f\ > C}) > 1/2. This contradicts the definition of 
C > 3||/||i since A({\f\ > C}) < \\f\\\/C < 1/3. We start with an interval / where \u\ < C and define u = u on 
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/. Then we consider a (left or right) neighbor interval J. If \u\ < C on J, we define 77 = u on J and proceed with 
a neighbor of / or J. If \u\ > C on J, we define ~u\j = u\i, i.e., we remove a jump. We have J(u) = J(u) - 1. 
Furthermore, \\u\j\\i < since \u\j\ < C < \u\j\. Also, ||«|y||i < 2||/|j||i because the value of u on J is the 

median of /on J and thus \\u\j\h - \\f\j\h < \\(f-u)\,\\ < \\f\j\\i. By (23), \\(u-f)\j\U < N/lli +II/I/II1 < < Wbh 
< y/2. In consequence, 

P y (u) = y/(77) + II/- will < yJ(u) - y + \\f - u\\ x + y/2 < P y (u). (25) 

Proceeding likewise for the other intervals, we obtain a piecewise constant function 77 which is smaller than C in 
absolute value and thus is contained in Mcj- Furthermore, P y (u) < P y (u). Since P y = Q y for piecewise constant 
functions which only jump in the discretization set and since the jump set of 77 is contained in the jump set of 
//. (25) remains true when we replace P y by Q k y . This shows (20). 

We are now able to prove the assertions of the theorem. Lemma 2 and Lemma 3 imply the F-convergence of 
Q k y to P y . Since (20) holds, we deduce from the lower semicontinuity of Q k y (which is a consequence of (17)) 
and the compactness of M that each Q k y has a minimizer which then is in M. Furthermore, the compactness of 
M implies the existence of a cluster point for any sequence of minimizers u k . Then (19) is a consequence of [6, 
Theorem 1.21, p. 29]. □ 

2.3. T convergence of discrete Potts functionals In this part, we use the results on the semi-discrete Potts 
functional to obtain F-convergence with respect to the L'-norm for the fully discrete Potts functionals defined by 
(7). We prove the following theorem. 



Theorem 5. Let f be an integrable function and let P y be the corresponding continuous L -Potts functional. 

y, Licji/itLt uy (7y i -x^uilvci iw 1 y . 



Then the discrete L 1 -Potts functionals P k defined by (7) T-converge to P y . Each sequence Uk, where Uk is a 



minimizer ofPz, has at least one accumulation point. Each such accumulation point u is a minimizer of P y , i.e., 

P y (u) = M veL i [0 ,i] P y (v). (26) 

Proof. We start showing the statement on T-convergence. We first establish a quantitative relation between P k 
and Q y . We may estimate 



J*(i0=y. J(ii) + ||ii-/||i 

< y ■ /(«) + ||h - Yj.S k f(j) ■ l I} h + 11/ - Yj.SkfU) ■ hjh 

= Qy(M) + II/- Yj.S k f(f) -l/.lh. (27) 

If S), is given by the integral sampling (5) we have the following inequality 

11/ - Yjj Sk ^ ■ Mi ^ c ■ su pj I 7 /!) < 28 > 

where C is a positive constant and a> is the L 1 modulus of continuity given by a>(f, t) = sup^^, ||/(« +h) — /||i, cf. 
[12]. Since the translation is continuous in L l and the union of the nested sampling sets X k is dense, it follows by 
(28) that ||/ - Yjj S kfif) ' l/,lli — > 0, as A: — > 00. Now exchanging the roles of P k and Q k y in (27) we conclude that 

\P k y (u) - Q k y (u)\ < ||/ - Yjj^fif) ■ hjh ^0, as k -» 00. (29) 

We emphasize that the above convergence is uniform in u. If we consider the point sampling operator (6) and a 
continuous function /, (29) remains true. This is because 

11/ " X S k fU)hjh < sup sup \f(x) - f(y)\ (30) 

tends to zero as k increases since / is uniformly continuous on [0, 1] and the maximum interval length \Ij\ goes 
to zero since the union of the nested sequence X k is dense. 
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In order to show F-convergence, we show (10) and (11) for the discrete Potts functionals. Let us consider a 
sequence Uk which converges to u in L 1 . Then P y (u k ) > Q k {u k ) +\P y (u k ) - Qz(u k )\. This, together with (12), 
implies in turn that 

P 7 (u) < liminf^ Q k (u k ) < liming P k (u k ) + limsup,^ sup„ £ , \P n y (u n ) - Q y (u")\. (31) 

Since the convergence in (29) is uniform in u the second summand on the right hand side is which shows (10). 
In order to obtain a recovery sequence for the discrete Potts functionals P k , we consider a recovery sequence Uk 
for the semidiscrete Potts functionals Q y , which exists according to Lemma 3. Then, by (18), 

P y (u) > lim sup,_ Q k y (u k ) > lim su P/Moo P k y {u k ) - lim sup t _ sup„ £t \P n y (u n ) - Q n y {u n )\. (32) 

As above, the second summand on the right hand side is and so mj. is a recovery sequence for P k , too. This 
shows (11) and we have shown F-convergence. 

Our next goal is to locate all minimizers of Pz in a common compact set M. To this end, we show the analogous 
statement to (20) replacing the semidiscrete Potts functional by their discrete counterparts, i.e., 

inf P k = inf P k < P k Jv) for all v $ M. (33) 

ueM ' «ePC[0,l] 7 ' 

The proof is based on the corresponding proof of (20) which can be found in the proof of Theorem 4. 

We first consider the case where S k is the integral sampling operator (5). We choose the compact set M = Mcj 
with C and j as in the proof of Theorem 4. For/ = 2/^^/(0 ' l//> we have \\fk\\\ < H/lli- So, similarly to (21), 
the following implication holds for any interval / : 

W.f k )\>c |/|<^ for any c> 0. (34) 

Here ///(/„) is the median of f„ on /. Furthermore, for e > there is S > such that 

\I\<6 fj\fk\<s (35) 

for all intervals / whose endpoints are in the discretization set X^. This is a consequence of (22) and the fact that 
Jj \fk\ < Jj l/l- Now combining (34) and (35) yields a statement analogous to (23): For e > there is C > such 
that 

N>C =^ fj\fk\<s (36) 

for any k and any interval / whose endpoints are in the discretization set X^. Now we can consider a piecewise 
constant function u whose jump set is contained in X* and proceed as in the proof of Theorem 4 to obtain a 
piecewise constant function ~u whose jump set is contained in X^, which has smaller Pz value, and which is 
contained in Mcj. This in turn implies (33). 

If S k is the point sampling operator (6) and if /is continuous, we have \\fk\\i < WfkWoa < ll/IU- Then (21) applied 
to fk yields the implication (34) with ||/||i replaced by II/IL- In this case (35) is trivial since \fk\ < ll/IU ■ |/|. As 
in the case of integral sampling we use these two implications to establish (36) which allows us to proceed as in 
the proof of Theorem 4 to obtain (33). 

Equipped with (33) and knowing that each P k y has a minimizer (see Theorem 7) we conclude that all minimizers 
are contained in M — Mcj- The compactness of M implies the existence of a cluster point for any sequence of 
minimizers u k . Then (19) is again a consequence of [6, Theorem 1.21, p. 29]. □ 

3 A new fast algorithm for the exact minimization of the L'-Potts functional 

In this section we present a fast algorithm to solve the discrete L'-Potts functional. More precisely, we propose 
an algorithm with 0(n 2 ) time complexity and O(n) space requirement for the exact minimization of the discrete 
L'-Potts functional 

P y {u) = \\u - f\\ e i + yj(u) = \m - /I + r ' #{' : m< * u i+ i}. (37) 
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This improves a previous results of Friedrich et. al. [17] who proposed an 0(n 2 log ri) time and 0(n 2 ) space 
complexity algorithm for the minimization of (37). Compared to a naive implementation consuming 0(n 3 log«) 
time, the approach in [17] improves the time complexity by a factor n at the cost of worsening space complexity 
by a factor n to 0(n 2 ). In contrast, our algorithm retains the linear space complexity of the naive approach, 
improving time complexity to 0{n 2 ). This answers a question of V. Liebscher, who asked if a minimizer of the 
L'-Potts functional can be computed in the same complexity as the minimization of the L 2 -Potts functional [24]. 

We provide an efficient implementation (available under [1 1]) of the new L'-Potts algorithm whose runtime is 
only about 20% higher than that of the L 2 -Potts algorithm, for larger signals. Actually, the proposed algorithm 
minimizes the Potts functional with weighted data fidelity term 

II" - f\\ei + y J ( u ) = 2, Wi<[Ui ~ fi\ + y # (' : u i * M '+i' ~* ^ ( 38 ) 

where the w, are positive weights. Clearly, if all weights w, are identical this reduces to the non-weighted L'-Potts 
functional (37). The weighted ^'-data term is important if we want to incorporate non-equidistant sampling as in 
Section 2. 

In Section 3.1 we explain our algorithm and show the results on time and space complexity. In Section 
Section 3.2 we show experiments comparing L'-Potts, L 2 -Potts, and L'-TV minimization under various types of 
noise. 

3.1. The algorithm We first recall the Potts core algorithm as proposed in [17]. Then we turn to the L'-Potts 
problem. We introduce a suitable data structure which we call indexed linked histogram. We demonstrate that 
the data structure provides all features to derive a fast algorithm. Finally, we show that our algorithm produces 
a minimizer in 0(n 2 ) time and 0(n) space complexity. We underpin this theoretical results by a comparison of 
runtimes between L 1 and L 2 -Potts algorithm which show that the L'-Potts algorithm is only 20% slower than the 
L 2 -Potts algorithm. 

The Potts core algorithm The Potts algorithm computes a minimizer u r+l for data (/i, ...,/ r +i) provided the 
knowledge of a minimizer u r for the data ( f\, ...,/,-) of length r, a minimizer u r ~ l for the data ( f\, ...,/ r _i) of 
length r — 1, . . . , and a minimizer u l for the data (/]) of length 1. The basic idea is to compute a set of r + 1 
candidate vectors v l , I = 0, ...,r, along with their Potts functional values P y (v'). Then, a candidate with the 
smallest Potts value is a minimizer u r+l for the data (/i, ...,/ r+ i). This procedure works concretely as follows. 
We get the first candidate v° by appending the new data value f r+ \ to the (known) minimizer u r corresponding to 
the data (f\, ...,f r ). Since this introduces a new jump, the Potts functional of this candidate is larger than that of 
u r by the jump penalty y. Thus, for the first candidate we have 

v° = {J^, / r+ i) and P y (v°) = P y (u r ) + y. 

length r 

Likewise, we construct the second candidate v by taking the minimizer u of the data (f\, ...,/ r _i) and the best 
constant approximation [i e R of the remaining data (/ r ,/ r+ i). This constant approximation by ft introduces a 
data penalty. Hence, in the second iteration, the candidate reads 

v'=(^,A<,//) and P 7 (v°) = P 7 (u"- X ) +y + \\(f r , f r+l ) - Qu,/*)|| . 

kngth r-1 =d [r -i,fi 

This procedure is continued until we arrive at the last candidate v r . Here, we have 

/ - Ui', . . . , n') and P y {v<) = j|(/i,-,/ r+ i) -0i , ,...,/i , )|| , 

where // e R is the best constant approximation of the full data vector (/!, ...,/,+i). Since V is a constant vector, 
there is no jump penalty. At the end, a candidate with minimal Potts value is a minimizer of the augmented data 
(fx, ...,f r+ i). We get a minimizer of the full data (/!, ...,/„) by iterating this procedure for r from 1 to the data 
length n. 
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For the sake of simplicity, we assumed in the description of the algorithm, that a minimizer candidate has a 
jump at the interface of the precomputed solution for ( f\, ...,//) and the best piecewise constant approximation 
for (fi+i, If there is no jump in some step of the iteration, then the algorithm overestimates the Potts value 

P y in that step. However, in that case, the correct minimizing Potts value is attained in one of the subsequent 
iterations. 

Procedure findBestPartitionLl(y e R, / e R", w e W + ) e N" 
input : data vector /eR", weight vector w e R" . 
output: partition stored in p e N" 

local : left and right interval bounds I, r 6 N; Potts values Bo, B n e R; temporary values fceR; 

IndexedLinkedHistogram H; temporary value ((el; 
begin 

B := -r; 

H <— new IndexedLinkedHistogramf ) ; /* ink indexed linked histogram */ 

for r <— 1 to n do 
B r <— oo; 

H.insertElementf f r , w r ) 
H.L. resetTemporaryf ); 
d^O; 

for I <— 1 to r do 

b <— + y + d ; 
if b < B r then 

B r <- b ; 
p r <- I - 1 ; 

end 

/* Remove the node /? and weight w;, and update the median and the median deviation 
H.removeElementTempfl, w\) ; 

d < — Hrf ; I* set cunent deviation d to that from the histogram H */ 



I* insert element to histogram */ 
/* Reset temp values of the nodes of the SortedLinkedList L in H to the non-temp ones. */ 

/* init d */ 

/* compute candidate Potts value b */ 

/* if candidate Potts value b is smaller, update best Potts value B r */ 
I* update length of latest partition */ 



; / 



end 



end 



end 



Now let us look at the time and space complexity of the Potts core algorithm. We see from the recursive 
scheme that the core Potts algorithm is an 0(n 2 ) time algorithm. One can also achieve a memory consumption of 
O(n) by the following idea. It is not necessary to store the minimizer candidates as full vector. We only require 
to store the length of its last constant interval, say \s + 1, . . . , r}, and the value of a best constant approximation 
to the data (f s +i, . . . on this last interval. The first s components of u r are then given by the components of 
the minimizer u s for the data (/i , . . . , f s ) which is obtained recursively in the same way. So we have to store only 
an integer and a float for each k — \,...,n which is linear in n. This idea suggests a natural split of the Potts 
algorithm into two steps (procedure minLlPotts). 



Procedure minLlPotts(y e R + , / e R") e R" 
input : Potts parameter y > 0; data vector / e R"; 
output: minimizer of the L 1 -Potts functional «el" 
begin 

p <— findBestPartitionLl(y,f); 

it <— reconstructionFromPartitionL\(p, /); 

end 



First, the method findBestPartitionLl finds the optimal jump set of a minimizer. (The Potts core part is printed 
in plain font, whereas the specific L 1 part explained later on is in italics.) In a second step, the method recon- 
structionFromPartitionL 1 reconstructs the minimizer by setting the values between two jumps to the best constant 
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approximation. 



Procedure reconstructionFromPartitionLl(p e N", / e R") e R" 
input : data vector / 6 R"; partition stored in p e N" 
output: estimation / e R" 
begin 

r <— n; I <— p r ; 
while r > do 

/i <- Median((Ji+i, fi +2 , ■ ■ -Jr)) ; 
for f <— I + 1 to r do 

I /»<-/*; 

end 

r <— I; I <— /? r ; 

end 



The fast L 1 -Potts algorithm We derive a proper algorithm for solving the L 1 -Potts functional based on the 
skeleton given by the Potts core algorithm. We have seen in the preceeding paragraph that in order to do so, we 
have to provide, for each step r, (1 < r < n), a best constant approximation p to data on the discrete intervals 
{/,...,/-}, and the corresponding approximation error 

for each / = 1 , . . . , r. Since we consider an l l data fidelity term a corresponding best constant approximation p is 
given by a median of the data (_/}, . . . , f r ). When considering the weighted i 1 norm for the fidelity term, we have 
to provide the approximation errors 

d[i,r] = XL w r\»-fi\ (l<l<r<n) (40) 

and the medians p = py^ for data (//, . . . , f r ) w.r.t. the weights Wi, . . . , w r . 

Finding a median is achieved by sorting the data (//, . . . , f r ) and taking a central component of the sorted 
vector, i.e., the 0.5 quantile w.r.t. the (normalized) weights. The sorting of the data costs 0(n\ogn), hence a 
naive algorithm for the L l -Potts minimization would be of class 0(n 3 logn). We shall now see how this can be 
accelerated to an 0(n 2 ) time algorithm. 

The key ingredient is a suitable data structure which we call indexed linked histogram. Before explaining 
the data structure we emphasize that it has the following property. Given a sorting of the data (/;,... ,/ r ), a 
corresponding median py^}, and the approximation error d[i 7 r] from (40), the indexed linked histogram allows for 
computing a sorting of the reduced data (// + i, . . . ,/,), the corresponding median P[i + i^ r ], and the error <tf[;+i, r ] in 
constant time. Assume for the moment that we have this indexed linked histogram structure at hand. Then, if we 
start with a sorting of the data (ft,... ,f r ), we are able to compute all r median deviations c/[i, r ],... d[ rr ] in 0(n). 
Hence the (inner) Hoop of procedure findBestPartitionLl runs in linear time. It remains to provide the initial 
sorting of the data ( f\ , . . . , f r ) required for the inner Hoop in reasonable time. Here, we may exploit that we 
already have a sorting of the data (/i, . . . ,/ r -i) from the previous iteration. So we can achieve a sorting of the 
data (f\, . . . ,f r ) in linear time by naively inserting f r into the sorting of the data (f\, . . . ,/ r -i). The median P[i, r ), 
can be be computed from the sorting in O(n) and a naive computation of the deviation d[\ yr \ then costs O(n) time. 
Summing up , computing rf[i, r ], ■ ■ • , d[ V ] can be done in 0(n) time; hence letting r — 1, . . . , «, the overall time 
consumption for computing all the required median deviations d[i >r -\ is 0(n 2 ). 

It remains to provide the data structure indexed linked histogram having the properties postulated in the last 
paragraph; see the class diagram IndexedLinkedHistogram. To make the crucial points better understandable, we 
occasionally back the explanations by the data set 



Data/ 


2 


1 


3 


1 


3 


Weight w 


0.15 


0.25 


0.3 


0.2 


0.1 



(41) 
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Data/ 


2 


1 


3 1 3 


Weight w 


0.15 


0.25 


0.3 0.2 0.1 


Pointer A 


• 








prevTemp 



Data/ 


2 


13 13 


Weight w 


0.15 


0.25 0.3 0.2 0.1 


Pointer A 


• 





uTemp 



0.45 



0.45 



'next 



jrev 



2 


nexf\ 


0.15 




0.0 


prev/ 



prevTemp 



0.4 



0.4 



Figure 2: Illustration of the class IndexedLinkedHistogram. The shaded (blue) rectangles depict the HistNodes with the entries, value, 
weight, and weightTemp (from top to bottom). Left: The temporary pointers at the beginning of the (inner) Z-loop coincide 
with the non-temporary ones. Right: In the first /-iteration, the left-most weight w\ is removed from the temporary weight by 
removeElementTemp(l, wi). Since the weight amounts to 0, the node is removed from the temporary list. The temporary median 
pointer M is shifted accordingly and the deviation is updated by updateMedianDeviationTemp(). 



An illustration of an indexed linked list is given in Figure 2. 

Recall that a histogram associated with data / = (/,... ,/„,) and weights w,- is the image measure f(w) of the 
measure w given by the weights w, under the mapping /. Thus a histogram has the data values as arguments and 
assigns to each data value / the sum Ei:/ ; =/- w i °f me weights w, of all data points with value / . In the example, 



Data value 


1 


2 


3 


Total weight 


0.45 


0.15 


0.4 



(42) 



A central characteristic of the indexed linked histogram is that each pairing "value" t-» "weight", e.g., 2 i-> 0.15, 
is realized as a histogram node; see the class diagram HistNode. The extra integer variable count contains the 
current number of single data points associated with the HistNode. The histogram nodes are arranged in a doubly 
linked list sorted according to the data values, i.e., by the member variable value of HistNode. Recall that, in a 
doubly linked list, each node contains a pointer to its next element as well as a pointer to its previous element. 
This allows for the removal of a node in constant time, provided that a pointer to that node is available. The in- 
sertion of a new node N to the sorted linked list L is O(ri), and we denote that operation by L . insertSorted(N). 

Class HistNode 

properties 

prev, next e HistNode ; 
prevTemp, nextTemp e HistNode ; 
value e R ; 

weight, weightTemp e R ; 
count, countTemp e No ; 

end 



/* pointers to next and previous list elements */ 
/* temporary pointers to next and previous elements */ 
/* the histogram value */ 
/* the (temporary) total weight of the histogram value */ 
/* the (temporary) number of weights */ 



A second important point of the indexed linked histogram is the differentiation between temporary values and 
pointers, and their non-temporary counterparts. This double structure allows for removing temporarily nodes 
from the histogram and for restoring the non-temporary state in linear runtime. The temporary removal of a 
node N from the linked list L is denoted by L . removeTemp (N) and the reset of the temporary values to the 
non-temporary ones by L . resetTemp . 

The third ingredient for the IndexedLinkedHistogram is the indexing of the histogram nodes. During the 
algorithm, we require to remove weights from the histogram which are associated with some data point / of 
index /. To this end, we store a pointer of each data point to its corresponding histogram node in an array 
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(member variable A in IndexedLinkedHistogram). In the example, A4 stores a pointer to the histogram node of 
value 1. 

At last, we require auxiliary variables keeping track of the temporary median M, the temporary weights above 
and below the median W„ and Wb, and the current median deviation d. 

Class IndexedLinkedHistogram 



properties 

L e SortedLinkedList of HistNode 
M e HistNode ; 

A e Array of HistNode pointers; 
de R; 
W a ,W b eR; 
end 

methods 

insertElement(/ r , w r ) ; 
removeElementTemp(/ e N, w 6 R) 
updateMedianDeviationTempO ; 
end 



/* a sorted linked list */ 
/* temporary pointers */ 

/* current median deviation */ 
/* current weights above and below median */ 



/* see page 15 */ 
/* see page 16 */ 
/* see page 17 */ 



The runtime of the L l -Potts algorithm depends on how fast we may update the histogram. To get an 0(n 2 ) 
runtime, the histogram update must be 0(n) in the (outer) r-loop, and 0(1) in the (inner) /-loop of proce- 
dure findBestPartitionLl. Thus, the expensive part, i.e., the sorting of the histogram, must take place in the 
outer loop. In the inner loop, we just exploit this sorting done in the outer loop. Note that the non-temporary 
histogram structure is manipulated in the outer loop, whereas the inner loop works with the temporary values. 

Let us begin with the explanation of the outer loop. Assume that we are given an indexed linked histogram 
for the data ...,/ r ). In step r + 1, we first insert the new element / r+ j into the sorted list of hist nodes. This 
requires a recurrence of all histogram nodes, which is of runtime 0(n). The median, the median deviation and 
the weights above and below the median of the new data ...,/ r +i) can be computed by further iterations over 
the list in linear time. At last, since the inner loop works with the temporary histogram structure, the temporary 
node values have to be set to the non-temporary ones. This costs 0(n), as well. In summary, the update of the 
histogram in the outer loop is of 0(n) runtime. 



Method insertElement(/ r e R) at class IndexedLinkedHistogram 



begin 

N <- L.findNode( f r ) ; 
if N - null then 

N <— new HistNode(/ r , w r ) ; 
L.insertSorted(AO ; 

else 

j N count * N count 1 > weight * N weight W V 

end 

A r <— TV ; 

M <— Median node of L; 

d <— YjNeL Mveight ' lvalue ~ lvalue I ', 



I* find node of value f r in SortedLinkedList L (in O(n)) */ 



/* insert HistNode N to sorted list (in O(n)) */ 



/* store pointer to N in array */ 



A r value<M valne ^weight 



w.. 



I* compute current median deviation (in O(n)) */ 
A^vaiue ^weight \ I* set weight below and above median */ 



end 



Now let us turn to the (inner) /-loop. Here, we have to answer the following question: Given an indexed linked 

histogram of the data (//,..., / r ), how can we update the histogram for the reduced data (// + i f r ) in constant 

time? To update the histogram, we need to remove the weight wi associated with the value /; from the histogram 
(method removeElementTemp(/, w;)). But notice that the search for a histogram node is already O(n), so search- 
ing for a node is not an option here. At this point the indexing by the array A comes into play. We may access 
the cell A/ in constant time, which in turn stores the pointer to the node whose temporary weight we decrease 
by wi and whose temporary count we decrease by 1. If the temporary count of the node equals zero after the 
decrement, we remove it from the temporary list. Due to the linked list structure, the removal operation is 0(1). 
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Method removeElementTemp(Z e N, w s R) at class IndexedLinkedHistogram 

begin 

R <- A, ; 

^countTemp * ^countTemp — 1 r 



R 



weightTemp 



R, 



jightTemp — W 



I* determine the node to remove (in O(l)) */ 
/* Update weights */ 
/* if we remove above median, update weight above median */ 
/* if we remove below median, update weight below median */ 



if lvalue > M va i U e then 

| w a ^W. d -w; 
else if fl value < M value then 

| W b <- W h - w ; 
end 

d <— d — |M va i ue — lvalue I j /* update deviation */ 

updateMedianDeviationTempO ; /* The removal of a weight makes a shift of the median pointer necessary, cf. page 17 */ 

if #count = then 

| L.removeTemp(7?) ; /* remove R temporary from L (in 0(1)) */ 

end 



end 



Along with the weight decrement, the median pointer M needs to be shifted and the median deviation d has to 
be updated as follows, cf. updateMedianDeviationTempO. Initially, M points to the histogram node associated 
with the median yu^,.] for the data (//, . . . , f r ). In that state, the weights below and above the median, 



w b = 2 Wi 

[i=l,...,r. fi<im,,ii 



and 



(i=Z,...,r:/i>rt W ! 



are both smaller than half of the total weight W lo , = w i- After the temporary removal of the weight wi, this 
balance condition may be violated. In consequence, the median pointer M has to be shifted such that the balance 
of the weights above and below the median is recovered. Along with each shift of the median pointer, we update 
the median deviation d and the weights above and below the median. It is clear that one shifting and update step 
has constant runtime. Now let us check how many of these shifting steps are necessary to recover the balance 
condition. If the weights Wi i = /,..., r are all equal, i.e. w, = w/n, then the removal of the weight Wi decreases 
the masses Wb and W a by at most w/n. In consequence, the balance condition is restored by at most one median 
shift. For later use, we record the following. 



Equal weights: Updating the median fJ.[i+\ 7 r] is achieved by at most one shift. 



(43) 



If the weights w, are not equal, then the removal of the weight w\ may produce a more unbalanced situation. Here 
the decrement of the masses Wb and W a can only be bounded by the maximum weight max, w,. Furthermore, we 
can only guarantee that a shift of the median pointer amounts to at least changing the corresponding masses by 
min, Wi. Therefore, we only can confine the number of shifts by the smallest integer larger than max, w,/min, w,. 
We record: 



Updating the median //[/+!,,] is achieved by at most 



max/ w i 



min/ Wi 



pointer shifts. 



(44) 



Here the symbol \x \ denotes the smallest integer bounding x from above. 

Last but not least, some comments are necessary on how to update the approximation error c/[/+i, r ] defined 
by (40). At first, the removal of a weight Wi associated with the value /; decreases the current deviation d 
by wi\fi - Then, every median shifting step additionally decreases the current deviation d by the current 
difference between weight above and weight below times the difference between the old and the new median 
pointer. So to update the deviation one needs as many steps as pointer shifts to update the median. We 

record: 



Updating the deviation t%+i >r ] is achieved by at most 



max,- Wi 
min,- Wi 



steps. 



(45) 
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For the exact update procedure, we refer to the pseudo-code of updateMedianDeviationTemp(). 



Method updateMedianDeviationTemp( ) at class IndexedLinkedHistogram 



local : Total weight W m e R. 



begin 




*/ 
*/ 



/* while the weight above the median is larger than half of the total weight, we shift the median pointer upwards 



while W a > do 



H <— M va lue ; /* store current median value 

W h «- W h + M wei ghtTemp I /* update weight below median 

M < — M nex tTemp \ I* shift median pointer upwards 

d <— d — \fl - M va i U e| ■ \W a - Wb\ ; /* update deviation 

W a <— W- d — M we ightTemp \ I* update weight above median 



*/ 
*/ 
*/ 
*/ 
*/ 



end 



/* while the weight below the median is larger than half of the total weight, we shift the median pointer downwards 



*/ 



while W h > do 



yU < — M V alue i /* store current median value 

W. d <— W- d + M we ightTemp > /* update weight above median 

M <— MpjevTemp \ I* shift median pointer downwards 

d <— d - \fi - Mvaluel ' \Wb - W & \ ; /* update deviation 

W b <- W h - M wei ghtTemp '•> /* update weight below median 



*/ 
*/ 
*/ 
*/ 
*/ 



end 



end 



Implementational aspects For the sake of compactness and simplicity, we described the fast L -Potts algorithm 
in a simplified form. However, in the practical implementation, we can save several recurrences of the histogram 
list by updating the median and the median deviation after every insertion of a new element. For details we refer 
to our implementation [11]. 

If the weights have extreme different orders of magnitude, loss of significance may occur. As a consequence, 
the median pointer could be shifted beyond the linked list during the iteration, resulting in null pointer exceptions. 
We solve this by extra null-checks in the median update. 

For portability reasons, the core Potts algorithm is written in Matlab and the indexed linked histogram struc- 
tures in Java. So there is room for further speed-up by an implementation in hardware-near languages such as 



Remark 1. We shortly comment on the connection of the median computations in the L -Potts algorithm to the 
median filter. A median filter computes the medians over the partial vectors (f,, fi+ m ) for i = 1, n - m where 
m is a fixed window size. The L'-Potts on the other hand requires the medians of the intervals (//, ...,/, ) for every 
1=1, r, and every r = 1, n. So the methods of fast median filtering, as e.g. in [21], cannot be applied here 
directly. 

Theorem on time and space complexity. So far we have explained our algorithm. Since our result on time and 
space complexity is particularly appealing in the case where all weights are equal, we consider this case first. It 
was already formulated as Theorem A in the introduction. 

Theorem 6. The proposed L l -Potts algorithm computes an exact minimizer of the discrete ( unweighted) L l -Potts 
functional (37) within 0(n 2 ) time and 0(n) space. 

This theorem is a special case of Theorem 7 which deals with non-equal weights w,. Non-equal weights 
become important when we consider non-equidistant sampling in a continuous model as in Section 2. We recall 
that the continuous Potts functional (1) is given by 



C. 



P 7 (u) = yJ(u) + \\u-f\U -» 



mm . 



(46) 
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10 2 10 3 10 4 10 2 10 3 10 4 

Length of signal n Length of signal n 

Figure 3: Left: Runtimes of L 1 and L 2 -Potts algorithm on a logarithmic scale. Right: Relative runtimes. For large signals, the L 1 -Potts is 
nearly as fast as the L 2 -Potts. For smaller signals, the runtimes of the L 1 -Potts is up to 2 times higher compared to the L 2 case. This 
is mainly due to the overhead of the histogram structure. The experiments were conducted on an Apple MacBook Pro, with Intel 
Core 2 Duo 2.66 GHz and 8 GB RAM, using our implementation available on [1 1]. 



On the k-th level, it is approximated by the discrete functionals (7) given by 

jy ■ /(«) + \\u - ZjStfU) ■ hi , if u e PC* [0, 1] 



else. 



(47) 



As in Section 2 Stf is a sampling of /; for example Stfij) might be the integral on some member interval 
Ij = 7® of a partition of [0, 1] on level k. We identify the functions u e PC*[0, 1], which only jump in by 

definition, (cf. Section 2) with vectors u in R", and let w~ k) = A(7- ) be the length of the interval /,. Then, (47) 

1 j J 

reads 



yj(u) + 2_ Ji Wi\ui - (S k f)i\ = yJiu) + \\u - S k f\\ t i -> min 



(48) 



which is a weighted discrete L l -Potts problem of type (38). If the weights behave well, we have the following 
statement on the complexity of the algorithm if the mesh size goes to 0. 



Theorem 7. Consider a sequence of weight vectors w w , . . . where each weight vector w w = (wf\ , w®) is 
of length n^ and the sequence increases. We assume that 



max,- w 



mm, w 



(k) 



< c 



with C independent ofk. 



(49) 



Then the proposed L l -Potts algorithm computes an exact minimizer of the discrete weighted L l -Potts functional 
within 0(nh time and Oin^) space as k increases. 

So the asymptotic time and space complexity of our L 1 -Potts algorithm equals that of the L 2 -Potts algorithm 
proposed by F. Friedrich et al. [17]. Remarkably, our algorithm is not only asymptotically of comparable runtime. 
The actual runtime of the L l -Potts algorithm is only about 20% higher than that of the L 2 -Potts algorithm, for 
reasonably large signals; we refer to Figure 3 for a comparison of runtimes. 

We show that the assumption (49) in Theorem 7 cannot be dropped. We consider data f and weights w,- given 
by the following table. 



i 


1 


2 


3 


4 




n 


fi 


1 


n 


2 


n- 1 




n/2 


Wi 


1/2 


1/4 


1/8 


1/16 




l/2" +1 



We see that, for updating the median in the inner /-loop, our algorithm has to shift the median pointer from the 
total left to the total right and vice versa, respectively. Thus the complexity of the median updating is of order n, 
and the algorithm does not have the claimed complexity. 
We conclude with the proof of Theorem 7. 
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(a) Laplacian noise, cr = 0.2. 
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(c) L 2 -Potts. 
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(b) L'-TV. 
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(d)!. 1 -Potts. 
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Figure 4: Robustness of the L 1 -Potts to Laplacian noise of noise level cr = 0.2 (ground truth: dashed line.) We see that the L 1 -TV reconstruc- 
tion yields only an approximately piecewise constant reconstruction, The L? -Potts reconstructions is not robust to strong outliers 
and does not hit all plateaus. The L'-Potts reconstructs the original piecewise constant signal almost perfectly, with only slight 
errors in the heights. 



Proof of Theorem 7. We begin with the time complexity. In the paragraph on the Potts core algorithm we have 
seen that the Potts core algorithm needs (9(n 2 ) operations. Since we assume (49), the medians fxg^ in the inner 
/-loop of findBestPartitionLl can be updated within (9(1); cf. (45). The same is true for the deviations d\i^ 
given by (40); cf. (44). This guarantees that the operations in the inner /-loop of findBestPartitionLl are of 
constant complexity (9(1). Since all operations in the outer r-loop are of complexity (9(n), we find that findBest- 
PartitionLl has time complexity of (9(n~) as explained subsequent to (40). Further, the reconstruction algorithm 
reconstructionFromPartitionL 1 is in (9(«log«) and thus uncritical. Overall, the whole algorithm requires <9(n 2 ) 
operations. 

Concerning space complexity we first note that the Potts core algorithm requires (9(h) space as explained in 
the paragraph on the Potts core algorithm. The only members of the class IndexedLinkedHistogram which need 
more than (9(1) space are "SortedLinkedList of HistNode L" and "Array of HistNode pointers A". The linked 
list L consists of at most n histogram nodes and the array A of at most n pointers. So the overall space needed 
for data structures is 0{n). So it remains to show that the space needed to provide the deviations d[i/\ given by 
(40) and the corresponding Potts values P y (v') from the paragraph on the Potts core algorithm is 0{n). To see 
this, note that the deviations cfy ,] are not precomputed (which would amount to (9(n 2 ) memory). Instead, their 
computation is interweaved into the findBestPartitionLl procedure; So, in each iteration of the inner Hoop of 
findBestPartitionLl, only the real value dy tr ] is computed and immediately reprocessed to find the Potts value 
P y {v r ). After this step, the value d^] is no longer needed. Hence, only (9(1) space is needed for the storage 
of the deviations. That in situ computation is possible because our data structure is adapted to the structure of 
findBestPartitionLl. Further, the storage needed for the Potts values is <9(n). All other temporary variables have 
a memory consumption of (9(1). Altogether, our algorithm has space complexity (9(h). 

It remains to show that our algorithm indeed provides a minimizer of the Potts problem. Examining the proof 
of [17, Theorem 2] shows that it also applies to the weighted L'-Potts problem, and thus, our algorithm actually 
computes a minimizer of the weighted L'-Potts functional. □ 
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(a) Salt and pepper noise, 40% of data points lost. 
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(b) L'-TV. 
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(d)L 1 -Potts. 
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Figure 5: Under salt and pepper noise, the L 2 -Potts hits only some of the right jumps and fails to reconstruct the right heights. The L -TV 
reconstruction, gets some of the plateaus perfectly but several outliers remain. The L 1 -Potts hits all the plateaus almost perfectly, 
only small shifts in the jump locations are visible. 



3.2. Numerical experiments of the L 1 -Potts functional We compare the reconstruction of piecewise constant 
functions by the L 1 -Potts functional with that of the L 2 -Potts functional and the L'-TV functional under different 
types of noise. For the L'-TV minimization experiments, we used the implementation of [9]. 

Our first experiment deals with additive Laplacian noise; see Figure 4. There, every data point is corrupted by 
a random variable of the Laplacian probability density of variance a 2 

1 _ vs. , 

p(x) = e * 1 , where o~ > 0. 

V2o- 

Next, we look at salt and pepper type noise; see Figure 5. Here, a certain fraction p of a true signal g is set to a 
random variable 77, that is, the data is given by 




gi, with probability 1 — p, 
j], with probability p. 



In our experiments, r\ is distributed uniformly in the interval [0,1]. At last we compare the reconstruction methods 
under Gaussian noise of variance cr 2 , see Figure 6. We observe that in all three experiments, the minimizers of 
the L 1 -Potts functional come closest to the ground truth. Only in the case of Gaussian noise, the l? and the 
L 1 -Potts have approximately equal performance. 



4 Robustness of the L -Potts functional to mildly blurred data 

In this section, we consider the L 1 -Potts functional for blurred data /, which is given by the convolution of a 
piecewise constant signal g with a convolution kernel K, i.e., 

arg min P y (u) — yJ(u) + \\u - f\\\, where f = K * g. 

& «ePC[0,l] 7 ' J J b 

For the rest of the section, we assume that K is a symmetric, positive, and compactly supported convolution 
kernel which is strictly positive in a neighborhood of the origin. 
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(a) Gaussian noise, cr = 0.1. 
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(c) Z/-Potts. 
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(b) L'-TV. 
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(d)L 1 -Potts. 
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Figure 6: In presence of Gaussian noise of noise level cr = 0.1, both the L l - and the L 2 -Potts algorithm give almost perfect reconstructions. 
The L'-TV reconstruction does not match all the plateaus. 



We show Theorem C which is a statement on the blind deconvolution capability of the above L l -Potts func- 
tional. This theorem consists of a continuous and a discrete part. The continuous part of Theorem C is formulated 
as Theorem 1 1 and states the following. There are Potts parameters y > and a maximal kernel size k > such 
that for all convolution kernel K with smaller support size 



g- inf P 7 (u)= inf yJ(u) + \\u - K * g\\\ 



wePC[0,l 



KePC[0,l 



(50) 



This means that the corresponding Potts functional P y reconstructs the true signal g from the blurred data /. 
Here, it is not necessary to know the kernel explicitly as long as its support size is sufficiently small. If we are 
given sampled data, a minimizer of the discrete L 1 -Potts is equal to g up to a shift of jump locations of at most 
half the sampling density. This is essentially the second part of Theorem C, which is formulated as Theorem 12. 

Before delving into the proofs, we illustrate the robustness to convolution in Figure 7. Indeed, we see that 
a minimizer of the L 1 -Potts restores the original signal from blurred data. In contrast, L'-TV just returns the 
blurred signal, and minimizers of the L 2 -Potts have additional plateaus at the large jumps of the signal. 

We have seen in the introductory example Figure 1 that L l -Potts solutions are still robust if the blurred signal 
is additionally corrupted by noise. There, we also observed that the quality of the reconstructions hardly depends 
on the type of noise. In particular, the L l -Potts functional copes with Gaussian, Laplacian, and salt and pepper 
noise. 

In order to show (50), we first need a series of lemmas. We denote the jump locations of the piecewise constant 
function g by < j\ < ... < j„ < 1 and let Z m ; n be the minimal interval length given by this partition; see Figure 8 
for an illustration. For positive k < = we consider centered intervals J, around the jump location j, given by 



Ji = Ui-K,ji+K]. 

We denote the complementary intervals Nj lying in between the jump locations given by 

Ni = (ji-l +K,ji-K). 

On the left boundary, we set N\ = [0, jfi) and on the right boundary A^„+i = (J n + k, 1]. 



(51) 



(52) 
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(a) Signal blurred by moving average of size 11. 




1 - 
0.8 - 
0.6 - 
0.4 - 
0.2 - 





128 
(b) L'-TV. 



256 



1 

0.8 - 
0.6 - 
0.4 - 
0.2 - 




128 
(d)L 1 -Potts. 
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Figure 7: Reconstruction of a blurred synthetic signal / = K * g by the L -TV method results in the blurred signal itself, The solution of 
the Lr -Potts functional does not reconstruct all the plateaus of the original signal, but adds extra plateaus at the larger steps. The 
minimizer of the L' -Potts, on the other hand, recovers the piecewise constant signal g perfectly, as stated in Theorem 11. The 
comparison with the other methods shows that the deconvolution property is indeed a unique feature of the L 1 -Potts functional. 



Lemma 8. For any uq e PC[0, 1], there is u\ e PC[0, 1] such that u\ has no jump in the union {JjNi and such 
that P y (u\) < Py(uo). Thus, 

min P y (u) = min{P y (i<) : u e PC[0, 1]; u has only jumps in /,}. 

Proof. Consider an arbitrary index i and assume that U(, jumps in N,. We first consider the case where m has more 
than one jump in N,. In this case, we define u\ — g on A^,- and u\ - uq elsewhere. Then we have J(u\) < J(uq) 
and \\g — Mi||j < \\g — Molli • Hence P y {u\) < P 7 {uq) which shows the lemma in this case. 

It remains to prove the case where uq has exactly one jump in Nu say at x € Af, = (I, r). To this end we compare 
the value di = \g— Uq\ on (/, x) with the value d r — \g - uq\ on (x, r). If di > d r we shift the jump position to the 
left, that means, we let u\ = uq(x + ) on A^,- and u\ = uq elsewhere. The number of jumps stays constant and 

IL? - "illi = IIC? - «o) Lvf Hi + Wis - «i) k Hi ^ 11^ - «olli ■ 

If d[ < d r , we define u\ = uq(x~) on A^, and u\ = uq elsewhere. Then, an analogous argument completes the 
proof. □ 

Lemma 9. Let u\ e PC[0, 1] with no jumps in U/M- If 

6/l max /C < 5/l m in(/min ~ 2k) - J, (53) 

then there is U2 € PC[0, 1] such that Pyiu.2) < P y (u\) and such that ui has at least one jump in each J,. 

Proof. Assume that U\ has no jump in 7, for some i'o. Then we find integers k > and s > 1 such that u\ is 
constant on N; a -k U Jj -u U . . . U 7, 0+J -i U Ni 0+S and jumps in J^-k-i and 7, 0+s . We denote by x\ the rightmost jump 
point of u\ in Jj -k-i an d by x r the leftmost jump point of U\ in 7, 0+s . We define u-i by 



U2 = 



\g, on(xi,x r ), 

I Mi, outside (xi,x r ). 
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Figure 8: Illustration of the minimum interval length / m j n , the maximum jump height /i max , and the minimum jump height h m i n of a piecewise 
constant function /. 



Obviously, u% has at least one jump in each for i = z'o — k — 1, . . . , z'o + s. Furthermore, we have 

J(u 2 ) = /(mi) + k + s. (54) 

Now we estimate the deviation of u<x to / = K * g on (xi, x r ) to obtain 

||(ii2 - K * g)\ {xuXt) \\\ = life ~ K * g)\(_ Xl , Xr) h <(k+s + 2)/ W 2* < 6(k + s)h max K. (55) 

This is because g equals K * g on each Nj and \\(g - K * g)|/J|i is bounded by the jump height of g in /, times the 
length of Jj which in turn equals 2k by definition. 

In order to estimate the deviation of u\ \oK*g on (xi, x r ) we need the following preparations. Let us consider 
a data sequence go, ...,g n with corresponding median m. For two consecutive members gj and gj +1 we see that 
\gj - m\ + \g j+ i - m\ > min, \g t - g M \. This implies 



which shows that, for all y e 



Y^" =0 \gi ~y\>'i min \ gi - g M \. 



(56) 



Now we come back to the estimate of ||(«i — K * g)| (*;,;<:,) 111- Recalling that u\ is constant on (xi,x r ) and that 
f = K * g equals g on each Nj, we see that 



! +i 



!„+S 



(m-£*£)ta».*)||i > J] ll(H j -X'*g)lj V( ||i >min|M| J] |y- ft | 



(57) 



i=io-k 



where y is the function value of u\ on (xi, x r ) and g, is the function value of g — K * g on Nj, where g is constant 
on. We employ (56) to get 



V . ,\y-gi\ > \(k + s)mm\gi- g i+x \ > \(k + s)h min . 
Plugging (58) into (57) and using Nj > l m i n - 2k we get 



(58) 



||(Mi - A" * g\ XbXr) h > (/min - 2k)^/w (59) 

Next we set up a lower bound for ||«2 — K * f\\\ in terms of \\u\ - K * /||] . To this end, we first restrict to the 
interval (xi, x r ) and obtain 

\\(u 2 - K * g)| (x ,, Xf) ||i < (fc + s)6h max K 

<(k+ s) (\h min (l min - 2k) - yj 

<\\(u 1 -K*g)\ (x , Xr) \\ l -(k + s)y. (60) 
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For the first inequality we used (55), for the second the assumption of the claim (53), and for the last inequality 
(59). Since u 2 equals u\ outside (x/,x r ), inequality (60) implies that 

IN ~K* = ||(mi - K * g)|[o,i]\(x, ; .v,)lli + ll(«2 ~K* g)\( Xl , Xr) \\i 

< H(ki - K * g)| [0 ,i]\(A7,. t ,)lli + ll("i - K * g)| ( ^ r )lli -(k+ s)y 
= \\(u 1 -K*g)\\ 1 -(k+s)y. 

and therefore, using (54), the assertion follows by 

P y (u 2 ) = J(u 2 ) + \\(u 2 - K * g)h < -/(«i) + 7(k +s) + \\(ui - K * g)|h - y(k + s) = P y (u\). a 

Lemma 10. Let u 2 e PC[0, 1] and assume that (53) holds. Then there is u^ e PC[0, 1] such that uj has no jumps 
in [JiNi and precisely one jump in each and 

P 7 (u 3 ) < P 7 (u 2 ), if2Kh m . dx <y. (61) 

Here, h max is the maximal jump height of g. 

Proof. By Lemma 8 we may assume that u 2 has no jump in U; N, and, by Lemma 9, that u 2 has at least one jump 
in each . 

We consider Ni and find u' such that u' agrees with g on Nj for all i and such that P y (u') < P y (u 2 ). To this end, 
we denote by the leftmost jump point of u 2 in and by r, the rightmost jump point of u 2 in . We define 

, (gi, on (/,-, r,), 

u — < 

\u 2 , else. 

where gi is the value of g on A 7 ,-. The number of jumps of u' coincides with that of u 2 . We first observe that 

Wig - U2)\N,\U ^ (tain ~ 2*)\gi ~ (u 2 )i\, 

since \N,\ > l m [ n - 2k. Using this estimate and noting that \\(g - w')liVjlIi = we obtain 

W(g-U')kk,r t )\h =Kg-U% k ,r tm h 

<Wig ~ K2)lc/,,r,)Willl + ll("2 - U%i hri)Wi \U 

+ Wig - U 2 )\ Nl W\ - (tain ~ 2K)\gi - (U 2 )i\. (62) 

Since u' = gi on (/,-, r t ) and \(U, r,) \ Ni\ < Ak we have 

ll(«2 - u')\Q urd \ Ni W\ < Mgi - (u 2 )i\. 

Plugging this into (62) we get 

Wig ~ u ')\(.h,rd\\l < Wig - «2)l(/„r,)lll + (6* - lmin)\gi ~ iu 2 )i\ < Wig ~ «2)l(^)IIl, 

since we assume that k < ,J ^. Therefore, P y (u') < P 7 (u 2 ). 

Now we define the function uj and show that its Potts-value P y {u^) < P y (u'). To that end, we consider the k 
intervals /, = [/,, r,] were the u' has more than one jump and define 



Uj 



g(l7), on(/ ; ,^), 
g(rf), on(^,r,), 
else. 



Then, J(uj) < J(u') -k. Furthermore, on each such interval J - [I, r], where u' has more than one jump, we have 
that 

WiK * g - U3 )\j\h < WiK * g - u%Wi + 2^ max . 
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Summing over all the k intervals where u' jumps more than once we get 

P y (u 3 ) = yJ(u 3 ) + MK*g-u 3 )\\ l 

< y(J(u') -k) + \\(K * g - u')\h + k • 2/c/w 
= P y (u) + k{2 K h mm -y)< P y {u'X 

since by hypothesis 2Ar/z max < y. Summing up, we have found a function M3 with P y (uj,) < P y (u') < P y (u2), which 
has exactly one jump in each interval □ 

Recall that the minimal and maximal jump height of g are denoted by h m i a and /i max , respectively. Furthermore, 
Z m in is the minimal interval length between two jumps of g; see Figure 8. 

Theorem 11. Let g be a piecewise constant function on [0, 1]. If the support size k of the convolution kernel K 
satisfies 

h ■ I ■ 

' £ min'min 

k < (63) 

2(8/! max + /l m i n ) 

then g is the unique minimizer of the Potts functional P y associated with f — K * g, i.e., 

^^/rW^^y^ + lk.-Zlli. (64) 
for any Potts parameter y satisfying 

2Kh m:lx < y < \h min l mia - (h min + 6/i max ) k. (65) 

Proof. First notice that (63) implies k < ^/ m i n , which we assumed at the beginning of this section (and was 
implicitly assumed in the preceding lemmas.) 
By (63) we get 

2^/z max < 2^ m i n ^ m i n — C^min 6/^rnax) 

which allows us to choose y according to (65). Manipulating the righthand inequality of (65) yields that 

6/lmax* < 5^min(^min ~ 2k) - y. (66) 

This is precisely the assumption on k and y in Lemma 9. Furthermore, the lefthand inequality of (65) ensures 
that Lemma 10 is applicable. Now Lemma 8 allows us to search the minimizer of P y , cf. (64), in the class of 
functions of piecewise constant functions without jumps in UiM- Then, by Lemma 9 and Lemma 10, we may 
restrict this class further to those functions having precisely one jump in each interval i.e., 

inf P y (u) - inf{P y (t<) : u jumps in no N, and precisely once in each /,■}. (67) 

w€PC[0,l] 

Now we consider such a function u which is a member of the set defined by the righthand side of (67) and assume 
that u does not equal g identically. Let us denote the jump locations of u by X\ , x„ with x,- e 7,-. Without loss of 
generality, we may assume that u coincides with g on each Ni, since otherwise, redefining u - gi on (x,, x, + i) (and 
probably also on the boundary intervals) yields a smaller P y value (cf. beginning of the proof of Lemma 10). 

Since u does not equal g, we find at least one interval 7, where the jump locations f of g and the corresponding 
jump location x,- of u do not coincide. We show that 

||(H-jr**)| /( ||i >llCg-***)Ulli (68) 

on each 7, where the jumps of g and u do not coincide. This implies, as g and u have the same number of jumps, 

that P y {u) > Py(g). 

To complete the proof it remains to show (68). To this end, we denote the lefthand and righthand boundary 
of Jj by / and r, respectively. We note that g(l) = u(r), g(r) = u(r + ), and g jumps at 1(1 + r) - f. Recall that, 
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by assumption, the jump location x, of u fulfills x-, + j t . The symmetry of K implies that the mass of K on the 
negative half axis equals that on the positive half axis, which in turn yields 



K*g(±(l + r))=\(g( r )+g(D). (69) 

Since K is positive, K * g is monotone on . Together with the above equation (69), this implies 

HQ? - K * g)\jh < \W) - g(l)\ ■ \J\, for any J c (70) 

Furthermore, as we assume that K > in a neighborhood of the origin, K * g is strictly monotone in a neighbor- 
hood of jj. This shows that inequality (70) is strict, i.e., we may replace the symbol "<" in (70) by "<". On the 
interval /' which is given by the endpoints x, and y, it holds that \u(x) - K * g(x)\ > \{g(f) - g(l)), for almost all 
x. Therefore, 

\\(K * g - u)Wh > \\g{r) - g{l)\ ■ > \\{K * g - g)\ r \\ x . 
As u and g are equal outside /' we have shown (68), which completes the proof. □ 

The next statement is the discrete part of Theorem C. 
Theorem 12. Let g be a piecewise constant function on [0, 1] and let k be the support size of the convolution 

ok 

y 



kernel K. We recall that P k is the discrete Potts functional given by (7) and is associated with tlie discretization 



set X k = {x\, . . . , X n ] of mesh size r\ - max,- — Xi-\\. We assume that 

^min^min , n1 v 

K + T]< . (71) 

2(8/i max + /l m in) 

Let g k — Yii a !'l(s;,si+i) be a "X k -discretization" of g — 2/ <*;l(«i,f J+I > where Sj e X k is a nearest neighbor of the 
jump location f; of g. Furthermore, we consider data f k — YijS kfU) ' 1/ obtained from sampling f — K * g on 
the discretization set X k using (5) or (6). Then each function g k is a minimizer of the discrete Potts functional P k 
associated with data f k , i.e., 

P*(«*)= inf P*(m)= inf 7 J(u) + \\u-f k \\ u (72) 

«£PC'[0,1] «ePC*[0,l] 

for any Potts parameter y satisfying 

2(k + J])h max <y< \h min l min - (h min + 6/i max ) {k + n). (73) 

The X k -discretizations g k of the above type are the only minimizers, hence the minimizer is unique whenever each 
jump location tj of g has a unique nearest neighbor in X k . 

Furthermore, ifX k is a nested sequence of discretization sets being eventually dense, each sequence of mini- 
mizers gk converges to g. 

Proof. The proof of Theorem 12 uses arguments anologous to those in the proof of Theorem 11. In particular, 
the statements and proofs of Lemma 8, Lemma 9, and Lemma 10 remain true in the setup of Theorem 12 if 
we replace each occurence of k by k + rj in these lemmata, their proofs and in the defintion of jV, and 7, in (51) 
and (52), respectively. Therefore, it is sufficient to show (72) for the functions u in PC*[0, 1] which jump in no 
Af, and precisely once in each . Thus, let us consider a function u, which we may assume, as in the proof of 
Theorem 11, to coincide with g on each N,. We show the following. If there is an interval J, around a jump 
location f, of g and if u does not jump on a nearest neighbor of f, in X k n 7,, then 

IK" - /*)Ulli > IK/ - and ||(/ - f k )\j.^ > ||(/ - f k )\j.\\ u (74) 

for any X k ^-discretization ~g k . Since the number of jumps of u, g k and ~g k are equal, (74) yields (72). 

In order to show (74), we first note that g jumps at f,-. Furthermore, by the proof of Theorem 11, /(?,) = K*g(ti) 
- MsitJ) + Si^tJ) -'■ a an d / i s monotone and strictly monotone in a neighborhood of So, if f, has two 
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(a) Signal blurred by a Gaussian kernel (b) and corrupted by 
Laplacian noise of <x = 0.1. 
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(c) K-L 1 -TV. 
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(b) Gaussian convolution kernel e~(°- 2,1 > (normalized). 
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Figure 9: Reconstruction of a signal blurred signal by a (normalized) Gaussian kernel (b) and corrupted by Laplacian noise. The L 2 penalized 
Potts reconstructions (c, d) miss one jump, but add extra jumps at outliers. The minimizer of the i'-TV associated deconvolution 
functional hits all plateaus correctly, but still has some outliers and oscillations. The proposed L 1 -Potts deconvolution strategy 
reconstructs the original signal almost perfectly. 



nearest neighbors I, r in X k , then the symmetry of K implies that f k = a — \(g(tj) + g(tt)) on the interval 
(I, r). Furthermore, f k is monotone on 7, and jumps at I and r. Then, if u does not jump at / or r, \u - f k \ > b 
:= \\g{tj) — g(tf)\ on the interval connecting the jump of u in and / or r depending on which of both is closer to 
the jump of u. In contrast, since g k and g 1 jump at / or r we have that \g k - f k \ < b and that \g k - f k \ — \g k - f k \ 
on So we have shown (74) in the case where f, has two nearest neighbors. 

If f, has only one nearest neighbor S; in X k , then on the one side (left or right) of we have fa > a, and on the 
other side (right or left) of s,, we have < a. Now, as above, if u does not jump at S; we have \u - f k \ > b on the 
interval connecting the jump of u and Since g k — u outside this interval and since \g k - f k \ < b on 7,, we have 
shown (74) in the case where f, has only one nearest neighbor. 

The last assertion of the theorem is a consequence of the fact that the mesh sizes of X k tend to as k — > oo. □ 

We note that, although the statement of the theorem holds for both kinds of sampling operators introduced in 
(5) and (6), it is more natural to consider point sampling here since averaging might be already modeled by the 
kernel K. 

5 Reconstruction of highly blurred piecewise constant signals 

In the previous section we have seen that the L l -Potts minimization is robust to mildly blurred data. In order 
to deal with more severely blurred data, we propose a heuristic splitting approach to tackle the deconvolution 
problem 

yj(u) + \\K*u- /Hi -> min. (K-L'-Potts) 

In contrast to the previous section, this requires to know the source of the blurring, that is, the convolution kernel 
K. We notice that (K-L 1 -Potts) has a solution in a discrete setting, that is for f,u € R". We provide the proof of 
this fact in a forthcoming work [10]. 
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We rewrite (K-L 1 -Potts) as the constrained optimization problem in two variables 

min yJ(u) + \\K * v - subject to u - v — 0. (75) 

We add the constraint u - v as the extra penalty fi\\u - v||i to the target functional, which yields the unconstrained 
optimization problem 

yj(u) +n\\u- vlh +\\K*v- /||, -» min . (76) 

By the extra parameter p > 0, we can control how strongly u and v are tied to each other. 

We approach the bivariate optimization problem (76) by the following heuristic. We first fix the variable v. 
Then minimizing (76) is equivalent to minimizing the L'-Potts function with data v and jump penalty i.e., 

min 2/( m ) + ||k-v||i. (77) 

U r 

Then, we fix u and optimize (76) with respect to the variable v. By the substitution w — u - v, this reads 

min n\\w\h + \\K* w - (K * u - f)\\ u (78) 

w 

which is an K-L l -L l deconvolution problem in w with respect to the data K *u — f. For the minimization of the 
K-L l -L l problem (78) we use the algorithm proposed in [1], alternative approaches can be found in [18, 8, 19]. 
To summarize, we alternately minimize the Potts problem (77) and the K-L l -L l problem (78), until we reach 
some stopping criterion. 

If the coupling parameter p is sufficiently large, we expect that the solutions of the split functional (76) are 
close to the solutions of the original problem (K-L 1 -Potts). However, if fi is too large, then the alternating 
minimization steps (77) and (78) have little freedom, so the iteration is likely is to get stuck. Therefore, we start 
the iteration with a small coupling parameter and successively increase it during the iteration. The complete 
algorithm is outlined in the procedure minKLlPottsSplit. 



Procedure minKLlPottsSplit(/ e R", y e R + , /jel + Je R ffl ) e 



input : Data / el", Potts parameter y > 0, initial coupling parameter ju > 0, convolution kernel K e R m . 
output: Piecewise constant signal u e R". 
local : Auxiliary variable w e R". 
begin 

V < — f ; /* initialize v by data / */ 

repeat 

U <— minLlPotts(v, ^); /* solve the L'-Potts problem (77) */ 

W <— minKLlLl(/T * U - f,/J,K) ; /* solve the K-L l -L [ problem (78) w.r.t. to data K*u-f*/ 

V <— U — W ; /* resubstituting data */ 

3 

H <— 2 A* i /* increase coupling parameter */ 

until not reached stopping rule; 
end 



In Figure 9, we compare the proposed splitting algorithm for the L'-Potts deconvolution problem (K-L 1 -Potts) 
with the solutions of the corresponding L'-TV problem 

yllVwHt +\\K * u — /||i -> min (K-L'-TV) 

using the algorithm proposed in [9]. We observe that the computed minimizer of the L'-TV functional still 
contains noise, whereas the solution of our approach recovers the piecewise constant signal almost perfectly. 
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6 Conclusion and future research 



We have seen that the L 1 -Potts functional is particularly well suited to reconstruct piecewise constant functions 
under non-Gaussian noise. We established a continuous L 1 -Potts model with the feature that the limits of mini- 
mizers of its (possibly non-equidistant) discretizations are minimizers of the continuous model. For the solution 
of these discrete L 1 -Potts functionals, we have presented an exact and fast algorithm. 

We could show that the L 1 -Potts functional is robust to mild blurring. For strongly blurred signals and known 
source of blurring, we have proposed an algorithm to solve the associated L 1 -Potts deconvolution problem. One 
point of future research is to obtain analytic results for this heuristic approach. 

Another direction of future research are generalizations to higher dimensions. The direct generalization of the 
Potts functional to higher dimensions is NP-hard [2], thus computationally unamenable. However, the compu- 
tational load may be reduced by restricting the set of admissible partitions. One approach in that direction are 
wedgelets [13]. In the L 2 setting, a fast wedgelet algorithm has been obtained in [16]. Building on the ideas for 
the fast computations of medians in this paper, the authors were able to derive a fast algorithm for the compu- 
tation of L'-wedgelets. Its computational complexity lies in the order of (9(«logn), where n is the number of 
pixels. This result is the matter of a forthcoming paper. Here, one direction of further research is to find fast 
algorithms for a wider class of allowed partitions. 
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